\chapter{\sysname{}-specific style guide}
\label{sicl-specific-style-guide}

\section{Commenting}

In most programs, comments introduce unnecessary redundancies that can
then easily get out of sync with the code.  This is less risky for an
implementation of a specification that is not likely to change.
Furthermore, we would like \sysname{} to be not only a high-quality
implementation, but we would like for its code to be very readable.
For that reason, we think it is preferable to write \sysname{} in a
``literate programming'' style, with significant comments explaining
the code.

Accordingly, we prefer comments to consist of complete sentences,
starting with a capital letter, and ending with punctuation.

\section{Designators for symbol names}

Always use uninterned symbols (such as \texttt{\#:hello}) whenever a
string designator for a symbol name is called for.  In particular,
this is useful in \texttt{defpackage} and \texttt{in-package} forms.

Using the upper-case equivalent string makes the code break whenever
the reader is case-sensitive (and it looks strange that the designator
has a different case from the way symbol that it designates is then
used).

Using interned symbols sends the wrong message to the person reading
the code, namely that the package of the symbol has some
significance.  Using an uninterned symbol immediately tells the
person reading the code that the package is not being used.

\section{Docstrings}

We believe that it is a bad idea for an implementation of a Lisp
system to have docstrings in the same place as the definition of the
language item that is documented, for several reasons.  First, to the
person reading the code, the docstring is most often noise, because it
is known from the standard what the language item is about.  Second,
it often looks ugly with multiple lines starting in column 1 of the
source file, and this fact often discourages the programmer from
providing good docstring.  Third, it makes internationalization
harder.

For this reason, we will provide language-specific files containing
all docstrings of Common Lisp in the form of calls to \texttt{(setf
documentation)}.

We also recommend using \texttt{format} (at read time) so that the
format directive \texttt{\~{}@} can be used at the end of lines,
allowing the following line to be indented as the rest of the text.
That way, we avoid the ugliness of having subsequent lines start in
column 1.

\section{Naming and use of slots}

In order to make the code as safe as possible, we typically do not
want to export the name of a slot, whereas frequently, the reader or
the accessor of that slot should be exported.  This restriction
implies that a slot and its corresponding reader or accessor cannot
have the same name.  Several solutions exist to this problem.  The one
we are using for \sysname{} is to have slot names start with the
percent character (`\texttt{\%}').  Traditionally, a percent character
has been used to indicate some kind of danger, i.e., that the
programmer should be very careful before directly using such a name.
Client code that attempts to use such a slot would have to write
\texttt{package::\%name} which contains two indicators of danger,
namely the double colon package marker and the percent character.

Code should refer to slot names directly as little as possible.  Even
code that is private to a package should use an internal protocol in
the form of readers and writers, and such protocols should be
documented and exported whenever reasonable.

\section{Standard functions}

Standard functions should always check the validity of their arguments
and of any other aspect of the environment.  If such a function fails
to accomplish its task, it should signal an appropriate condition.

We would like for errors to be signaled by the code that was directly
invoked by user code, as opposed to in terms of code that was
indirectly invoked by system code.  As an example, consider a sequence
function such as \texttt{substitute}.  If it is detected that a dotted
list has been passed to this function, it should not be reported by
\texttt{endp} or any other system function that was not directly
called by user code, but instead it should be reported by
\texttt{substitute}.  On the other hand, if substitute invokes a
user-supplied test that fails, we would like the error message to be
reported in terms of that user-supplied code rather than by
\texttt{substitute}.  This is how we are currently imagining solving
this problem:

\begin{itemize}
\item Standard functions do not call any other standard functions
  directly, other than if it is known that no error will be signaled.
  When a call from a standard function $f$ to a standard function $g$
  might result in an error being signaled by $g$, that call is
  replaced by a call to a special version of the standard function,
  say $h$ that signals a more specific condition than is dictated by
  the Common Lisp standard.
\item If acceptable in terms of performance, a standard function such
  as $f$ that calls other functions that may signal an error, handles
  such errors by signaling an error that is directly related to $f$.

\end{itemize}

\section{Standard macros}

Standard macros must do extensive syntax analysis on their input so as
to avoid compilation errors that are phrased in terms of expanded
code.

As with standard functions, standard macros that expand into other
system code that may signal an error should not use other standard
functions or other standard macros directly, but instead special
versions that signal more specific conditions.  The expanded code
should then contain a handler for such errors, which signals an error
in terms of the name and the arguments of the macro.

\section{Compiler macros}

{\sysname} will make use of compiler macros for standard functions
when doing so seems reasonable, and when some performance improvement
is likely.  Compiler macros are part of the standard, so this
mechanism must be part of a conforming compiler anyway.  In some
cases, instead of encoding special knowledge in the compiler itself,
we can use compiler macros.  By doing it this way, we simplify the
compiler, and we provide a set of completely portable macros that any
implementation can use.

However, in \sysname{} we can use \emph{call-site optimization} to
optimize function calls in many situations where a typical
\commonlisp{} implementation would have to resort to compiler macros.
In particular, call-site optimization can usually optimize calls to
functions with optional and/or keyword parameters.

For cases where call-site optimization is not likely to be able to
handle the situation, compiler macros should be used whenever the
exact shape of the call site might be used to improve performance of
the callee. In particular, for functions that take a \texttt{\&rest}
argument, we can provide special cases for different common sizes of
the \texttt{\&rest} argument.  For example, it seems reasonable to
define a compiler macro for \texttt{list*} that converts the call to
nested calls to \texttt{cons}.

\section{Conditions and restarts}

\sysname{} functions should signal conditions whenever this is
required by the Lisp standard (of course) and whenever it is
\emph{allowed} by the Lisp standard and reasonably efficient to do so.
If the standard allows for subclasses of indicated signals (I think
this is the case), then \sysname{} should generate as specific a
condition as possible, and the conditions should contain all available
information as possible in order reduce the required effort to
find out where the problem is located.

\sysname{} function should also provide restarts whenever this is
practical.

\section{Condition reporting}

Condition reporting should be separate from the definition of the
condition itself.  Separating the two will make it easier to customize
condition reporting for different languages and for different
systems.  An integrated development environment might provide
different condition reporters from the normal ones, that in addition
to reporting a condition, displays the source-code location of the
problem.

Every \sysname{} module will supply a set of default condition
reporters for all the specific conditions defined in that module.
Those condition reporters will use plain English text.

\section{Internationalization}

We would like for {\sysname} to have the ability to report messages in
the local language if desired.  The way we would like to do that is to
have it report conditions according to a \texttt{language} object.  To
accomplish this, condition reporting trampolines to an
implementation-specific function \texttt{sicl:report-condition} which
takes the condition, a stream, and a language as arguments.

The value of the special variable \texttt{sicl:*language*} is passed
by the condition-reporting function to \texttt{sicl:report-condition}.

In other words, the default \texttt{:report} function for conditions is:

\begin{verbatim}
   (lambda (condition stream)
     (sicl:report-condition condition stream sicl:*language*))
\end{verbatim}

Similarly, the Common Lisp function \texttt{documentation} should
trampoline to a function that uses the value of
\texttt{sicl:*language*} to determine which language to use to show
the documentation.

\section{Package structure}

{\sysname} has a main package containing and exporting all Common Lisp
symbols.  It contains no other symbols.  A number of implementation
packages import the symbols from this package, and might define
internal symbols as well.  Implementation packages may export symbols
to be used by other implementation packages.

This package structure allows us to isolate implementation-dependent
symbols in different packages.

\section{Assertions}

\section{Threading and thread safety}

Consider the use of locks to be free.  We predict that a technique
call ``speculative lock elision'' will soon be available in all main
processors.

\section{Extensible Libraries}

During the development on \sysname{}, we have come up with a
widely applicable technique for defining extensible libraries.  Such an
extensible library is defined around a set of generic functions that take a
client object as their first argument.  This client object is forwarded by
all internal calls of the library, such that all consecutive calls within
that library receive the same first argument.  The default behavior of the
library is defined by methods whose first parameter specializes only to the
class \texttt{T}.

Whenever a client calls a generic function of the library, it must supply a
client object that can be clearly distinguished from client objects of any
other client.  Ideally, the client object should be an instance of a class
that is defined by that client, but other variants like using a symbol
whose name relates to the client are also possible.  The client can then
extend the behavior of the library by defining additional methods
specialized to the client object or its class.

Designing a library in this way has a number of benefits:

\begin{itemize}
\item Clients can introduce very deep changes to the original behavior of
  the library, including use of some entirely different data structures, or
  additional \texttt{:before}, \texttt{:after}, and \texttt{:around}
  methods.
\item The library can intentionally leave some parts of its behavior
  undefined, and rely on clients to fill in this information.
\item The library can be used by multiple clients simultaneously, even in
  the case where the types of some of the involved data structures alone
  are not specific enough for discrimination.  (For example, imagine two
  clients for a library that operates on multisets.  The first client might
  represent these sets as a list of objects, and the second library might
  represent these sets as an alist whose keys are objects and whose values
  are their count.  Without the client argument, the generic functions for
  working with such multisets wouldn't be able to detect whether they are
  used by the first client, or by the second client.)
\item If a client defines its own class of client objects and supplies an
  instance of that class as the client argument of an extensible library,
  it is possible to customize the client behavior further by means of
  inheritance, e.g., to extract the common behavior of several clients into
  a base class, to mix the behavior of several clients, or to introduce
  minor modifications to an existing client.
\item Clients can add additional information or even mutable state to the
  client object.  This state is then automatically available to all methods
  applicable to this client object.  Introducing state this way can avoid
  what would otherwise have been a global data structure.
\end{itemize}

We use this technique pervasively in {\sysname}.  The result is that the
{\sysname} codebase itself contains just a few client classes and methods,
while the bulk of the functionality is available as independent, extensible
libraries.  Examples of such libraries are Eclector, the portable Common
Lisp reader, or Clostrum, an implementation of first-class global
environments.
